DUALITIES IN PERSISTENT (CO)HOMOLOGY 

VIN DE SILVA, DMITRIY MOROZOV, AND MIKAEL VEJDEMO-JOHANSSON 

Abstract. We consider sequences of absolute and relative homology and cohomology groups that 
arise naturally for a filtered cell complex. We establish algebraic relationships between their per- 
sistence modules, and show that they contain equivalent information. We explain how one can use 
the existing algorithm for persistent homology to process any of the four modules, and relate it to 
a recently introduced persistent cohomology algorithm. We present experimental evidence for the 
practical efficiency of the latter algorithm.

1. Introduction

The subject of inverse problems deals, fundamentally, with the inference of shape. From some
related measurements, such as a family of particular path integrals, we try to deduce geometric
information. With the classical techniques in the field, such as Fourier and other integral
transforms, one can deduce an impressive amount of information. However, with non-linearity, and
in ill-posed, ill-conditioned situations, the classical methods need increasingly large amounts of
regularization or data cleaning. Topology offers a family of methods that bring the inference of
information, if not geometric then at least topological, into the field. In particular, the recent
development of persistent homology [1], and its applications to topological data analysis [2],
demonstrate an approach to topological invariants that becomes applicable to high-dimensional,
finite and discrete measurement sets.

To take an explicit example, geological sonar investigations employ inverse problem methods to
investigate the geometric structure of the density sublevel sets in subterranean domains, relating
density variations to occurrences of oil, water or mineral pockets. The kind of information sought
starts out with a qualitative judgement: is there a pocket at all; are there several or few; are
they connected or not? These first questions, before the shape can be given an explicit geometric
description, are a matter of topological properties, and the study of sublevel sets of functions on
domains is one of the most convincing uses of persistent homology.

The persistent homology algorithm of Edelsbrunner, Letscher, and Zomorodian [1] is now ten
years old. In its natural general form [3], the input is a filtered 'space' (topological space, or
simplicial complex, or abstract chain complex) and the output is a collection of half-open real
intervals known as a barcode or a persistence diagram.

These barcodes contain one bar for each topological feature found: one bar for each homology
class, representing a hole or a higher-dimensional void. These bars come with a starting point,
indicating the focal level at which the feature first becomes visible, and an ending point, indicating
the focal level at which the feature vanishes again. A fundamental tenet, as described in [2], is that
the length of such a bar (the difference between when it shows up and when it vanishes) encodes
the relevance of the feature. This emphasizes the topological features that are enveloped by a dense
distribution of points, and yet have a geometrically large void in the middle.

In many applications, all that is required is the barcode. This tells us how many homological
features exist at any given level of the filtration, and how many of those survive to any given
subsequent level. This information is already very rich, and has been proven to be statistically
robust [4, 5]. Sometimes more is required. The most common request is for geometric representatives
of the features: in other words, explicit homology cycles representing each barcode interval. The
original algorithm provides these cycles automatically: they are essential to the way in which the
barcode intervals are calculated.

In fact, there are at least four natural persistent objects that can be derived from a filtered space.
They are:

$$\text{persistent}\ \begin{Bmatrix}\text{absolute}\\ \text{relative}\end{Bmatrix}\ \begin{Bmatrix}\text{homology}\\ \text{cohomology}\end{Bmatrix}$$



The 'standard' object is persistent absolute homology, and most treatments focus on this. However,
it has become increasingly clear that the other three objects are important in their own right. The
transition between homology and cohomology is in some sense nothing more than the duality of
vector spaces; persistent homology and cohomology have the same barcodes. However, homology
cycles and cohomology cocycles are quite different, and some applications call for cocycles rather
than cycles [6]. The occasional utility of relative rather than absolute homology is probably easier
to grasp intuitively; for example see [7] for an application in sensor networks. It is easy to 'fake' the
calculation of relative homology using absolute homology and a cone construction, but we point
out that this trick is unnecessary.

Our goal in this paper is to provide a streamlined approach to calculating barcodes and (co)cycle 
representatives for all four persistent objects. We discuss this approach in terms of abstract algebra 
and in terms of matrix computations. 

We observe that: 

• absolute homology and cohomology have the same barcode; 

• relative homology and cohomology have the same barcode; 

• the absolute barcode and the relative barcode can be deduced from each other; 

• the cycles and bounding chains of persistent absolute homology determine, and are deter- 
mined by, the cycles and bounding chains of persistent relative homology; 

• likewise, for absolute and relative cohomology cocycles and bounding cochains. 

We discuss two different dualities. There is the standard duality which interchanges homology 
and cohomology. We call this 'pointwise' duality. More interestingly, there is a different duality 
which makes the following interchange: 

$$\text{absolute homology} \longleftrightarrow \text{relative cohomology}$$
$$\text{absolute cohomology} \longleftrightarrow \text{relative homology}$$

We call this 'global' duality, and it appears only in the context of persistent topology. Global 
duality 'commutes' with all possible algorithms and theorems: a method for calculating persistent 
absolute homology will equally well calculate persistent relative cohomology, once the input data 
have been turned upside-down in a particular way. 

Combining all of these equalities and dualities, it emerges that a single calculation (run twice)
suffices to calculate all four persistent objects. Actually, we describe two different algorithms for
that calculation: pHcol (the 'column algorithm') and pHrow (the 'row algorithm'). Here pHcol is
essentially the classic algorithm of [1, 3]; pHrow organises the calculation quite differently. The
preferred choice depends, in any given situation, on whether it is easier to look up rows or columns
of the boundary matrix of the filtered space; the specific representation of the space usually
biases this choice.

We are rewarded by an unexpected payoff. If we require only the absolute barcode, it turns out
that the best choice is an optimised version of pHrow called pCoh (the 'cohomology algorithm'). We
give experimental evidence to this effect. Standard practice has been to use pHcol. We therefore call
on persistent topology library-writers to implement pCoh, and on persistent topology library-users
to use it.



1.1. Outline of paper. Section 2 is devoted to the algebra underlying this work. In 2.1-2.5 we
conduct the discussion at a high level (homology functors are assumed given, black-box style), and
in 2.6-2.7 we go into the necessary chain-level details. In 2.8 we give a brief abstract description
of the two dualities.

Section 3 is about matrix algorithms. In 3.1-3.4 we interpret the preceding algebra in terms of
matrix decomposition (following [8], again black-box style). In 3.5 we present the two algorithms,
pHcol and pHrow, and explain why they give the same output.

In Section 4 we relate the ideas in this paper to an earlier cohomology algorithm pCoh published
in [6]. We indicate why we expect pCoh to be faster than pHcol and pHrow for computing barcodes
of filtered simplicial complexes, and we verify this by experiment.



2. Algebra 

We will assume that the reader is familiar with homology theory. Our preference is to use cellular 
homology, because it is a little more general than simplicial homology. 

2.1. Coefficients. Individual (co)homology groups are defined with coefficients in a field $\mathbf{k}$, which
remains fixed throughout this paper. Persistent (co)homology then has the structure of a graded
module over the polynomial ring $\mathbf{k}[t]$. Many things go wrong when we replace the field $\mathbf{k}$ with a
ring, in particular the ring of integers $\mathbb{Z}$. See [3].

2.2. Filtered complexes. We are interested in the persistent topology of filtered topological
spaces. The simplest example is a filtered cell complex, which is a sequence $\mathbb{X}$ of cell complexes

(2.1)    $X_1 \subset X_2 \subset \cdots \subset X_n = X_\infty$

where $X_1$ is a vertex $\sigma_1$, and thereafter each complex is obtained from the previous one by adding
a single cell: $X_i = X_{i-1} \cup \sigma_i$. Here the index set is $\{1, 2, \ldots, n\}$. Usually we attach real values $a_i$
to the indices, which must satisfy $a_1 \le a_2 \le \cdots \le a_n$.

Example. Our running example $\mathbb{S}$ will be a cellular filtration of the 2-sphere.
There are six cells, $\sigma_1, \ldots, \sigma_6$, which appear at times $a_i = i$, for $i = 1, \ldots, 6$.

2.3. Persistent homology. If we apply a homology functor $H(-)$ to a filtered complex $\mathbb{X}$ we
obtain a diagram:

(2.2)    $H(\mathbb{X}) :\quad H(X_1) \to H(X_2) \to \cdots \to H(X_n)$

Typically $H(-)$ denotes the $k$-dimensional homology $H_k(-; \mathbf{k})$ or the total homology $H_*(-; \mathbf{k})$. Then
(2.2) is a diagram of finite-dimensional vector spaces and linear maps, also known as a persistence
module.

A persistence module decomposes as a direct sum of interval modules [3]. These are labelled
by ordered pairs of integers $[p, q]$, where $1 \le p \le q \le n$. The pair $[p, q]$ indicates a feature which
persists over the index set $\{p, \ldots, q\}$. We frequently interpret $[p, q]$ as the half-open real interval
$[a_p, a_{q+1})$, with the convention that $a_{n+1} = \infty$.

The persistence diagram or barcode is the multiset of ordered pairs $[p, q]$ in the decomposition,
or alternatively the multiset of half-open intervals $[a_p, a_{q+1})$. Thus we write:

$$\mathrm{Pers}(H(\mathbb{X})) = \{[p_1, q_1], \ldots, [p_m, q_m]\} = \{[a_{p_1}, a_{q_1+1}), \ldots, [a_{p_m}, a_{q_m+1})\}$$

It is customary in applications to discard from the persistence diagram those intervals $[a_p, a_{q+1})$
for which $a_p = a_{q+1}$.

Example. In our running example, the intermediate spaces $S_1, S_3, S_5$ are all contractible, whereas
$S_2, S_4, S_6$ are homeomorphic to the 0-sphere, 1-sphere, and 2-sphere, respectively. There are four
intervals in the persistence diagram of $H_*(\mathbb{S})$:

$$\mathrm{Pers}(H_*(\mathbb{S})) = \{[1,6]_0, [2,2]_0, [4,4]_1, [6,6]_2\} = \{[1,\infty)_0, [2,3)_0, [4,5)_1, [6,\infty)_2\}$$

The subscript $k$ in $[p, q]_k$ or $[a_p, a_{q+1})_k$ indicates that the feature occurs in $k$-dimensional homology.

2.4. The four standard persistence modules. The standard persistent homology module $H_*(\mathbb{X})$
tells us how the absolute homology groups $H_*(X_i)$ relate to each other as $i$ varies. We can play the
same game with the absolute cohomology groups $H^*(X_i)$, the relative homology groups $H_*(X_n, X_i)$,
and the relative cohomology groups $H^*(X_n, X_i)$. Here are the four sequences, lined up for comparison.

$$\begin{array}{ll}
H_*(\mathbb{X}) : & H_*(X_1) \to \cdots \to H_*(X_{n-1}) \to H_*(X_n) \\
H^*(\mathbb{X}) : & H^*(X_1) \leftarrow \cdots \leftarrow H^*(X_{n-1}) \leftarrow H^*(X_n) \\
H_*(X_\infty, \mathbb{X}) : & H_*(X_n) \to H_*(X_n, X_1) \to \cdots \to H_*(X_n, X_{n-1}) \\
H^*(X_\infty, \mathbb{X}) : & H^*(X_n) \leftarrow H^*(X_n, X_1) \leftarrow \cdots \leftarrow H^*(X_n, X_{n-1})
\end{array}$$

The persistence diagram for absolute cohomology is a multiset of integer ordered pairs $[p, q]$ with
$1 \le p \le q \le n$. For relative homology and cohomology, the persistence diagrams are multisets
of pairs $[p, q]$ with $0 \le p \le q \le n - 1$. In all cases, we interpret $[p, q]$ as the half-open interval
$[a_p, a_{q+1})$, with the convention that $a_0 = -\infty$ and $a_{n+1} = \infty$.

Example. In our running example, we compute

$$\mathrm{Pers}(H_*(S_6, \mathbb{S})) = \{[0,0]_0, [2,2]_1, [4,4]_2, [0,5]_2\} = \{[-\infty,1)_0, [2,3)_1, [4,5)_2, [-\infty,6)_2\}.$$

For instance, at index 2 we note that there is a nontrivial element of $H_1(S_6, S_2)$ represented by any
arc connecting the two points of $S_2$. To be specific, the homology class is $[\sigma_3] = [\sigma_4]$. This class
vanishes in $H_1(S_6, S_3)$, and so it generates the interval $[2,3)$.
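The index conventions used so far can be illustrated with a small Python sketch. The helper name `to_interval` and the list `a` of filtration values are illustrative assumptions, not notation from the paper; the function simply applies the rule $[p, q] \mapsto [a_p, a_{q+1})$ with $a_0 = -\infty$ and $a_{n+1} = \infty$ to the index pairs of the running example.

```python
import math

def to_interval(p, q, a):
    """Convert an index pair [p, q] into the half-open interval [a_p, a_{q+1}).

    `a` lists the values a_1..a_n (so a[i - 1] is a_i). The conventions
    a_0 = -infinity and a_{n+1} = +infinity cover the relative and infinite cases.
    """
    n = len(a)
    birth = -math.inf if p == 0 else a[p - 1]
    death = math.inf if q == n else a[q]
    return (birth, death)

# Running example: six cells appearing at times a_i = i.
a = [1, 2, 3, 4, 5, 6]
absolute = [to_interval(p, q, a) for p, q in [(1, 6), (2, 2), (4, 4), (6, 6)]]
relative = [to_interval(p, q, a) for p, q in [(0, 0), (2, 2), (4, 4), (0, 5)]]
```

Here `absolute` and `relative` reproduce the interval forms of the two example barcodes (without the dimension subscripts).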

The reader may detect a relationship between the barcodes for absolute and relative homology. 
We formalize this in the next section. 



2.5. Barcode isomorphisms.

Proposition 2.3. For all $k$,

$$\mathrm{Pers}(H_k(\mathbb{X})) = \mathrm{Pers}(H^k(\mathbb{X})), \qquad \mathrm{Pers}(H_k(X_\infty, \mathbb{X})) = \mathrm{Pers}(H^k(X_\infty, \mathbb{X})).$$

In other words, homology and cohomology have identical barcodes.

Proof. The universal coefficients theorem [9, Thm 3.2] asserts that there is a natural isomorphism

$$H^k(X; \mathbf{k}) \cong \mathrm{Hom}(H_k(X; \mathbf{k}), \mathbf{k}).$$

In other words, cohomology and homology are dual as vector spaces, and hence have the same
dimension. 'Natural' implies that the induced maps

$$H_k(X_i; \mathbf{k}) \to H_k(X_j; \mathbf{k}) \quad\text{and}\quad H^k(X_i; \mathbf{k}) \leftarrow H^k(X_j; \mathbf{k})$$

are adjoint, and hence have the same rank. Because the barcode is uniquely determined
by dimensions and ranks, it follows that the absolute homology and cohomology barcodes are the
same. This argument applies equally well to the relative barcodes. □

Notation. We partition each persistence diagram into two parts,

$$\mathrm{Pers} = \mathrm{Pers}_0 \cup \mathrm{Pers}_\infty,$$

where $\mathrm{Pers}_0$ comprises the finite intervals $[a, b)$, and $\mathrm{Pers}_\infty$ the infinite intervals $[a, \infty)$ or $[-\infty, b)$.

Proposition 2.4. For all $k$,

$$\mathrm{Pers}_0(H_k(\mathbb{X})) = \mathrm{Pers}_0(H_{k+1}(X_\infty, \mathbb{X})), \qquad \mathrm{Pers}_\infty(H_k(\mathbb{X})) = \mathrm{Pers}_\infty(H_k(X_\infty, \mathbb{X})),$$

where the second 'equality' is interpreted as a bijection with $[a, \infty) \leftrightarrow [-\infty, a)$. Thus, persistent
homology and relative homology barcodes carry the same information, with a dimension shift for
the finite intervals.



The proof appears in Section 2.6.
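As a sanity check on Proposition 2.4, the translation between the two barcodes can be sketched in a few lines of Python. The function name `relative_from_absolute` and the triple format `(start, end, dimension)` are assumptions made for this illustration only.

```python
import math

def relative_from_absolute(absolute):
    """Translate an absolute homology barcode into the relative homology barcode.

    Each bar is (start, end, dim), with end == math.inf for an infinite bar.
    Finite bars keep their endpoints and move up one dimension; an infinite
    bar [a, inf) becomes [-inf, a) in the same dimension.
    """
    relative = []
    for start, end, dim in absolute:
        if math.isinf(end):
            relative.append((-math.inf, start, dim))
        else:
            relative.append((start, end, dim + 1))
    return relative

# Absolute barcode of the running example.
absolute = [(1, math.inf, 0), (2, 3, 0), (4, 5, 1), (6, math.inf, 2)]
relative = relative_from_absolute(absolute)
```

Applied to the running example, this returns the four bars of the relative barcode computed in Section 2.4.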



Remark. Thus, provided we take the dimension shifts into account, all four barcodes carry exactly
the same information. If we are only interested in barcodes, we can perform calculations in any
one of the four basic sequences, whichever is the most convenient.

Since the last term of $H_k(\mathbb{X})$ is the same as the first term of $H_k(X_\infty, \mathbb{X})$, namely $H_k(X_n)$, the
two sequences can be concatenated into a single sequence, which we denote $H_k(\mathbb{X}) \to H_k(X_\infty, \mathbb{X})$.
The index set for this sequence is $\{1, 2, \ldots, n = \bar{0}, \bar{1}, \bar{2}, \ldots, \overline{n-1}\}$, where we use barred numerals
to indicate that we are in the relative homology part of the sequence. The persistence diagram for
this complex will have intervals of three possible types:

• $[p, q]$ where $1 \le p \le q < n$, written as $[p, q+1)$ or $[a_p, a_{q+1})$ in interval form.

• $[\bar{p}, \bar{q}]$ where $0 < p \le q \le n - 1$, written as $[\bar{p}, \overline{q+1})$ or $[\bar{a}_p, \bar{a}_{q+1})$.

• $[p, \bar{q}]$ where $1 \le p \le n$, $0 \le q \le n - 1$, written as $[p, \overline{q+1})$ or $[a_p, \bar{a}_{q+1})$.

Proposition 2.5. The barcode $\mathrm{Pers}(H_k(\mathbb{X}) \to H_k(X_\infty, \mathbb{X}))$ comprises the following collection of
intervals:

• An interval $[a, b)$ for every interval $[a, b)$ in $\mathrm{Pers}_0(H_k(\mathbb{X}))$.

• An interval $[\bar{a}, \bar{b})$ for every interval $[a, b)$ in $\mathrm{Pers}_0(H_{k-1}(\mathbb{X}))$.

• An interval $[a, \bar{a})$ for every interval $[a, \infty)$ in $\mathrm{Pers}_\infty(H_k(\mathbb{X}))$.



Proof. Note that the first two classes of interval in $\mathrm{Pers}(H_k(\mathbb{X}) \to H_k(X_\infty, \mathbb{X}))$ are those which
do not meet the middle term $H_k(X_n)$, and thus correspond exactly to finite intervals in
$\mathrm{Pers}(H_k(\mathbb{X}))$ and $\mathrm{Pers}(H_k(X_\infty, \mathbb{X}))$. This explains the first two cases, once we make the translation
$\mathrm{Pers}_0(H_k(X_\infty, \mathbb{X})) = \mathrm{Pers}_0(H_{k-1}(\mathbb{X}))$.

It remains to show that the intervals of type $[a, \bar{b})$ are always of the form $[a, \bar{a})$.¹ To do this, we
need to compare the right filtration of the sequence $H_k(\mathbb{X})$ with the left filtration of the sequence
$H_k(X_\infty, \mathbb{X})$. The first filtration is the nested sequence of subspaces

$$\mathrm{Im}(H_k(X_i) \to H_k(X_n)), \quad i = 1, 2, \ldots, n - 1,$$

of $H_k(X_n)$, and the second filtration is the nested sequence of subspaces

$$\mathrm{Ker}(H_k(X_n) \to H_k(X_n, X_i)), \quad i = 1, 2, \ldots, n - 1,$$

of $H_k(X_n)$. But the image and kernel subspaces are equal for each $i$, by the homology long exact
sequence for the pair $(X_n, X_i)$. Thus the filtrations are the same. □

Remark. The sequence $H_k(\mathbb{X}) \to H_k(X_\infty, \mathbb{X})$ is not the same as the extended persistence [10]. The
latter, defined for the sublevel sets of a real-valued function, requires the reversal of the cells in the
relative half of the sequence: it translates into the use of the superlevel sets of the function. The
meaning of extended persistence for a general filtered space is a lot less straightforward. The most
significant difference between the two sequences (besides the definition) lies in the extended pairs,
the intervals corresponding to $[a, \infty)$ in $\mathrm{Pers}_\infty(H_k(\mathbb{X}))$. In Proposition 2.5 they become the trivial
intervals $[a, \bar{a})$; on the other hand, they are the main reason extended persistence was introduced:
these pairs carry the new information. Another notable difference is that the dualities in this paper
apply to general filtered spaces; the Poincaré and Lefschetz dualities involved in the analysis of
extended persistence require the domains to be manifolds.

2.6. Persistent chain complexes. We now give a more explicit description of the standard
persistence modules, in terms of chain complexes. Among other things, this will lead to a clean proof
of Proposition 2.4. Given a filtered cell complex $\mathbb{X} = \sigma_1 \cup \cdots \cup \sigma_n$, define a persistence module

$$\mathbb{C} : C_1 \to C_2 \to \cdots \to C_n$$

where $C_i = \langle \sigma_1, \ldots, \sigma_i \rangle$, the vector space over $\mathbf{k}$ with basis elements labelled $\sigma_1, \ldots, \sigma_i$. We also
have a boundary map: the boundary of any $\sigma_j$ is a linear combination of cells which appear
previously:

$$\partial\sigma_j = \sum_{i<j} D_{ij}\,\sigma_i$$

for some collection of coefficients $D_{ij}$. Geometrically, the cells $\sigma_i$ for which $D_{ij} \neq 0$ will have
dimension one less than the dimension of $\sigma_j$.

The boundary map satisfies $\partial^2 = 0$, and it restricts to boundary maps $\partial_i : C_i \to C_i$ for each $i$.
Then $C_*(X_i) = (C_i, \partial_i)$ is the chain complex² for the absolute homology of $X_i$, and $C_*(\mathbb{X}) = (\mathbb{C}, \partial)$
is the persistent version for $\mathbb{X}$. Accordingly, we define the persistent absolute homology of $\mathbb{X}$ to be

$$H_*(\mathbb{X}) = H(\mathbb{C}, \partial) = \mathrm{Ker}(\partial)/\mathrm{Im}(\partial).$$



¹ Thus the paired intervals $[a, \infty)$ and $[-\infty, a)$ in $\mathrm{Pers}_\infty(H_k(\mathbb{X}))$ and $\mathrm{Pers}_\infty(H_k(X_\infty, \mathbb{X}))$ are really the restrictions
of a single interval $[a, \bar{a})$ in the concatenated sequence.

² For simplicity we generally suppress the homological grading within each $C_i$, which comes from the geometric
dimensions of the cells associated to the generators. We will refer to it only when necessary.



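Shown explicitly as a persistence module, this is:

$$H(\mathbb{C}, \partial) :\quad \frac{\mathrm{Ker}(\partial_1)}{\mathrm{Im}(\partial_1)} \to \frac{\mathrm{Ker}(\partial_2)}{\mathrm{Im}(\partial_2)} \to \cdots \to \frac{\mathrm{Ker}(\partial_n)}{\mathrm{Im}(\partial_n)}.$$

For the absolute cohomology persistence module $H^*(\mathbb{X})$, we define

$$\mathbb{C}^* : C_1^* \leftarrow C_2^* \leftarrow \cdots \leftarrow C_n^*$$

where $C_i^* = \mathrm{Hom}(C_i, \mathbf{k}) = \langle \sigma_1^*, \sigma_2^*, \ldots, \sigma_i^* \rangle$, with $\{\sigma_j^*\}$ being the dual basis to $\{\sigma_j\}$. The coboundary
$\delta = \partial^*$ is defined to be the adjoint to $\partial$. Then $C^*(\mathbb{X}) = (\mathbb{C}^*, \delta)$ and

$$H^*(\mathbb{X}) = H(\mathbb{C}^*, \delta) = \mathrm{Ker}(\delta)/\mathrm{Im}(\delta).$$

Again, this is a persistence module (with arrows to the left).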

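Example. In our running example, the boundary map is given as follows:

$$\partial\sigma_1 = \partial\sigma_2 = 0, \qquad \partial\sigma_3 = \partial\sigma_4 = \sigma_1 - \sigma_2, \qquad \partial\sigma_5 = \partial\sigma_6 = \sigma_3 - \sigma_4.$$

This information is recorded in matrix $D$ of Figure 1. The coboundary map is given as follows:

$$\delta\sigma_1^* = -\delta\sigma_2^* = \sigma_3^* + \sigma_4^*, \qquad \delta\sigma_3^* = -\delta\sigma_4^* = \sigma_5^* + \sigma_6^*, \qquad \delta\sigma_5^* = \delta\sigma_6^* = 0.$$

This information is recorded in matrix $D^\perp$ of Figure 1.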
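The example can be checked mechanically. The following Python sketch builds the boundary matrix from the boundary formulas of the running example, confirms that the boundary map squares to zero, and reads each coboundary off the corresponding row of the matrix (the adjoint of the boundary). The names `boundary`, `matmul` and `coboundary` are illustrative choices, not taken from the paper.

```python
n = 6
# Boundary of each cell as {row index i: coefficient D_ij}, 1-based, from the example.
boundary = {
    1: {}, 2: {},
    3: {1: 1, 2: -1}, 4: {1: 1, 2: -1},
    5: {3: 1, 4: -1}, 6: {3: 1, 4: -1},
}

# D[i][j] (0-based) is the coefficient of sigma_{i+1} in the boundary of sigma_{j+1}.
D = [[boundary[j + 1].get(i + 1, 0) for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two square integer matrices."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def coboundary(i):
    """Coboundary of sigma_i^* as {j: coefficient of sigma_j^*}; this is row i of D."""
    return {j + 1: D[i - 1][j] for j in range(n) if D[i - 1][j] != 0}

d_squared_is_zero = all(v == 0 for row in matmul(D, D) for v in row)
```

For instance, `coboundary(1)` gives `{3: 1, 4: 1}`, in agreement with $\delta\sigma_1^* = \sigma_3^* + \sigma_4^*$.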

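The relative homology and cohomology persistence modules are defined as the homology of the
persistence modules

$$(C_n/\mathbb{C}) : C_n \to (C_n/C_1) \to (C_n/C_2) \to \cdots \to (C_n/C_{n-1})$$

$$(C_n/\mathbb{C})^* : C_n^* \leftarrow (C_n/C_1)^* \leftarrow (C_n/C_2)^* \leftarrow \cdots \leftarrow (C_n/C_{n-1})^*$$

with boundary and coboundary maps induced from $\partial$, $\delta$ in the natural way. Thus

$$C_*(X_\infty, \mathbb{X}) = (C_n/\mathbb{C}, \partial), \qquad C^*(X_\infty, \mathbb{X}) = ((C_n/\mathbb{C})^*, \delta),$$

$$H_*(X_\infty, \mathbb{X}) = H(C_n/\mathbb{C}, \partial), \qquad H^*(X_\infty, \mathbb{X}) = H((C_n/\mathbb{C})^*, \delta).$$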

Remark. We note that the maps $\to$ of $\mathbb{C}$ and the maps $\leftarrow$ of $(C_n/\mathbb{C})^*$ are injective, whereas the
maps $\leftarrow$ of $\mathbb{C}^*$ and the maps $\to$ of $(C_n/\mathbb{C})$ are surjective. In other words, absolute homology and
relative cohomology are structurally akin to each other, and qualitatively different from absolute
cohomology and relative homology. This is a symptom of the global duality mentioned in the
introduction.

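The main theorem of [1, 3] can be restated in the following way.

Theorem 2.6. Given $\mathbb{C}$, $\partial$ as above, there exists a partition

$$\{1, 2, \ldots, n\} = F \sqcup G \sqcup H$$

with a bijective pairing $G \leftrightarrow H$, written as follows:

$$g \text{ is paired with } h \iff [g, h) \in \mathrm{Pairs} = \mathrm{Pairs}(\mathbb{C}, \partial).$$

Moreover, there is a new basis $\hat\sigma_1, \hat\sigma_2, \ldots, \hat\sigma_n$ of $C_n$ such that:

(1) $C_i = \langle \hat\sigma_1, \ldots, \hat\sigma_i \rangle$ for all $i$.

(2) $\partial\hat\sigma_f = 0$ for all $f \in F$.

(3) $\partial\hat\sigma_h = \hat\sigma_g$, and hence $\partial\hat\sigma_g = 0$, for all $[g, h) \in \mathrm{Pairs}$.

It follows that the persistence diagram $\mathrm{Pers}(H(\mathbb{C}, \partial))$ consists of the intervals $[a_f, \infty)$ for $f \in F$
together with the intervals $[a_g, a_h)$ for $[g, h) \in \mathrm{Pairs}$. □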

We note that item (1) is equivalent to the assertion that the leading term of each $\hat\sigma_i$ is $\sigma_i$ (up to
a nonzero scalar).

In the language of [1], the index set $F$ identifies the positive simplices which remain unpaired,
the index set $G$ identifies the positive simplices which do get paired, and the index set $H$ identifies
the negative simplices. The vectors $\hat\sigma_f$ and $\hat\sigma_g$ are the cycles with leading terms $\sigma_f$ and $\sigma_g$, and
the vector $\hat\sigma_h$ is the chain with leading term $\sigma_h$ which 'kills' the homology class of its paired $\hat\sigma_g$ by
means of the equation $\partial\hat\sigma_h = \hat\sigma_g$.

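Example. In our running example, $F = \{1, 6\}$, $G = \{2, 4\}$, $H = \{3, 5\}$ and $\mathrm{Pairs} = \{[2,3), [4,5)\}$.
The new basis is

$$\hat\sigma_1 = \sigma_1, \quad \hat\sigma_2 = -\sigma_2 + \sigma_1, \quad \hat\sigma_3 = \sigma_3, \quad \hat\sigma_4 = -\sigma_4 + \sigma_3, \quad \hat\sigma_5 = \sigma_5, \quad \hat\sigma_6 = \sigma_6 - \sigma_5.$$

The reader can easily verify that $\partial\hat\sigma_1 = \partial\hat\sigma_2 = \partial\hat\sigma_4 = \partial\hat\sigma_6 = 0$, that $\partial\hat\sigma_3 = \hat\sigma_2$, and $\partial\hat\sigma_5 = \hat\sigma_4$.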
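The verification left to the reader can also be done by machine. The sketch below represents chains as Python dictionaries from cell index to coefficient (a representation chosen for this illustration) and checks the three conditions of Theorem 2.6 for the basis of the running example.

```python
# Boundary of each cell in the running example, as {cell index: coefficient}.
boundary = {
    1: {}, 2: {},
    3: {1: 1, 2: -1}, 4: {1: 1, 2: -1},
    5: {3: 1, 4: -1}, 6: {3: 1, 4: -1},
}

# The new basis sigma_hat_i, written in terms of the original cells.
new_basis = {
    1: {1: 1},
    2: {1: 1, 2: -1},
    3: {3: 1},
    4: {3: 1, 4: -1},
    5: {5: 1},
    6: {5: -1, 6: 1},
}

def d(chain):
    """Extend the boundary map linearly to a chain, dropping zero coefficients."""
    out = {}
    for cell, coeff in chain.items():
        for face, c in boundary[cell].items():
            out[face] = out.get(face, 0) + coeff * c
    return {k: v for k, v in out.items() if v != 0}

# Condition (1): the leading (highest-index) term of sigma_hat_i is sigma_i.
leading_ok = all(max(chain) == i for i, chain in new_basis.items())
# Condition (2): unpaired generators are cycles.
unpaired_ok = all(d(new_basis[f]) == {} for f in (1, 6))
# Condition (3): each negative generator bounds its paired positive generator.
pairs_ok = all(d(new_basis[h]) == new_basis[g] for g, h in [(2, 3), (4, 5)])
```

All three flags evaluate to `True` for this example.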



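Proof of Proposition 2.4. The decomposition $\{1, 2, \ldots, n\} = F \sqcup G \sqcup H$ and the new basis
$\hat\sigma_1, \hat\sigma_2, \ldots, \hat\sigma_n$ allow us to express $\mathbb{C}$ as a direct sum of persistent chain complexes:

$$\mathbb{C} = \bigoplus_{f \in F} \mathbb{C}^f \;\oplus \bigoplus_{[g,h) \in \mathrm{Pairs}} \mathbb{C}^{g,h}$$

where $\mathbb{C}^f = \langle \hat\sigma_f \rangle$ and $\mathbb{C}^{g,h} = \langle \hat\sigma_g, \hat\sigma_h \rangle$. Moreover, the boundary map $\partial$ respects this decomposition,
mapping each summand into itself. We can therefore calculate $H((C_n/\mathbb{C}), \partial)$ on each summand
separately.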

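For summands of type $\mathbb{C}^f$, the persistence modules are constant over two phases, with index
ranges $\{0, \ldots, f - 1\}$ and $\{f, \ldots, n - 1\}$. We can condense this information by representing them
as two-term persistence modules (one term for each index range):

$$\begin{array}{rcl}
(C^f)_n/\mathbb{C}^f : & \langle \hat\sigma_f \rangle & \to\ 0 \\
\mathrm{Ker}(\partial) : & \langle \hat\sigma_f \rangle & \to\ 0 \\
\mathrm{Im}(\partial) : & 0 & \to\ 0 \\
H = \mathrm{Ker}(\partial)/\mathrm{Im}(\partial) : & \langle \hat\sigma_f \rangle & \to\ 0
\end{array}$$

It follows that $H((C^f)_n/\mathbb{C}^f)$ contributes the interval $[-\infty, a_f)$. This is generated by $[\hat\sigma_f]$ and hence
has the same homological dimension as $[a_f, \infty)$ in $\mathrm{Pers}(H(\mathbb{C}))$.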

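For summands of type $\mathbb{C}^{g,h}$ the persistence modules are constant over three phases, with index
ranges $\{0, \ldots, g - 1\}$, $\{g, \ldots, h - 1\}$ and $\{h, \ldots, n - 1\}$:

$$\begin{array}{rccl}
(C^{g,h})_n/\mathbb{C}^{g,h} : & \langle \hat\sigma_g, \hat\sigma_h \rangle & \to\ \langle \hat\sigma_h \rangle & \to\ 0 \\
\mathrm{Ker}(\partial) : & \langle \hat\sigma_g \rangle & \to\ \langle \hat\sigma_h \rangle & \to\ 0 \\
\mathrm{Im}(\partial) : & \langle \hat\sigma_g \rangle & \to\ 0 & \to\ 0 \\
H = \mathrm{Ker}(\partial)/\mathrm{Im}(\partial) : & 0 & \to\ \langle \hat\sigma_h \rangle & \to\ 0
\end{array}$$

It follows that $H((C^{g,h})_n/\mathbb{C}^{g,h})$ contributes a single interval, $[a_g, a_h)$. This is generated by $[\hat\sigma_h]$, and
hence has dimension one greater than $[a_g, a_h)$ in $\mathrm{Pers}(H(\mathbb{C}))$, that being generated by $[\hat\sigma_g]$. □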



The following table summarises the relationship between the three types of generator and the
persistence intervals they generate.

$$\begin{array}{l|ccc}
\text{generator} & \hat\sigma_f & \hat\sigma_g & \hat\sigma_h \\ \hline
\text{absolute homology} & [a_f, \infty) & [a_g, a_h) & \\
\text{relative homology} & [-\infty, a_f) & & [a_g, a_h)_+
\end{array}$$

The homological dimension of each interval is equal to the homological dimension of its generator.
So, for any pair $[g, h) \in \mathrm{Pairs}$, the dimension of $[a_g, a_h)$ in $H_*(X_\infty, \mathbb{X})$ is one greater than the
dimension of $[a_g, a_h)$ in $H_*(\mathbb{X})$. We indicate this in the table with a $+$ subscript.

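2.7. Cohomology. Persistent relative cohomology is structurally similar to persistent absolute
homology. To make this apparent, let us introduce new notation, writing

$$\mathbb{C}^\perp : C_1^\perp \to C_2^\perp \to \cdots \to C_n^\perp$$

for the reverse of the sequence

$$(C_n/\mathbb{C})^* : C_n^* \leftarrow (C_n/C_1)^* \leftarrow \cdots \leftarrow (C_n/C_{n-1})^*,$$

so $C_i^\perp = (C_n/C_{n-i})^*$.

Recall that $\sigma_1^*, \ldots, \sigma_n^*$ denotes the basis of $C_n^*$ dual to the basis $\sigma_1, \ldots, \sigma_n$ of $C_n$. If we write
$\tau_i = \sigma^*_{n+1-i}$, then

$$C_i^\perp = (C_n/C_{n-i})^* = \langle \sigma_n^*, \sigma_{n-1}^*, \ldots, \sigma_{n+1-i}^* \rangle = \langle \tau_1, \tau_2, \ldots, \tau_i \rangle$$

by elementary linear algebra. Moreover, if $\partial\sigma_j = \sum_{i<j} D_{ij}\sigma_i$ then $\delta\tau_j = \sum_{i<j} D^\perp_{ij}\tau_i$, where
$D^\perp_{ij} = D_{n+1-j,\,n+1-i}$. On account of the formal similarity between $\mathbb{C}$ and $\mathbb{C}^\perp$ we conclude: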



Corollary 2.7. Suppose we have an algorithm which takes as input the sequence of cells $\sigma_1, \ldots, \sigma_n$
and their boundaries $\partial\sigma_1, \ldots, \partial\sigma_n$ and produces as output the persistent absolute homology of
$\mathbb{X} = \sigma_1 \cup \cdots \cup \sigma_n$.

Then the same algorithm applied to the sequence of formal cells $\tau_1, \ldots, \tau_n$ and coboundaries
$\delta\tau_1, \ldots, \delta\tau_n$ computes the persistent relative cohomology of $\mathbb{X}$. □



Persistent absolute cohomology can be thought of as 'relative persistent relative cohomology'.
More precisely, by elementary linear algebra:

Proposition 2.8. The persistence module $(C_n^\perp/\mathbb{C}^\perp)$ is the reverse of $\mathbb{C}^*$, and the respective
coboundary maps agree. □

Corollary 2.9. Suppose we have an algorithm which takes as input the sequence of cells $\sigma_1, \ldots, \sigma_n$
and their boundaries $\partial\sigma_1, \ldots, \partial\sigma_n$ and produces as output the persistent relative homology of
$\mathbb{X} = \sigma_1 \cup \cdots \cup \sigma_n$.

Then the same algorithm applied to the sequence of formal cells $\tau_1, \ldots, \tau_n$ and coboundaries
$\delta\tau_1, \ldots, \delta\tau_n$ computes the persistent absolute cohomology of $\mathbb{X}$. □

Remark. We must transcribe the indices correctly for these two corollaries. If such an algorithm
applied to the $\tau_j$, $\delta\tau_j$ produces a persistence interval $[p, q - 1]$ for $\mathbb{C}^\perp$ then this is equivalent to
$[n + 1 - q, n - p]$ for $(C_n/\mathbb{C})^*$; and hence to the half-open real interval $[a_{n+1-q}, a_{n+1-p})$.
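The index transcription in this remark is easy to get wrong, so a small Python sketch may help. The function name `perp_interval_to_real` is hypothetical. The input pairs used below assume that the persistence intervals of $\mathbb{C}^\perp$ for the running example are $[1,6]$, $[2,2]$, $[4,4]$, $[6,6]$; this is an assumption of the illustration, obtained by hand from the coboundary formulas of the example.

```python
import math

def perp_interval_to_real(p, q, a):
    """Translate the interval [p, q - 1] of C^perp into [a_{n+1-q}, a_{n+1-p}).

    `a` lists a_1..a_n, and a_0 = -infinity by convention. Valid inputs
    satisfy 1 <= p < q <= n + 1.
    """
    n = len(a)
    lo_index, hi_index = n + 1 - q, n + 1 - p
    start = -math.inf if lo_index == 0 else a[lo_index - 1]
    end = a[hi_index - 1]
    return (start, end)

a = [1, 2, 3, 4, 5, 6]
# (p, q) encodes the closed index interval [p, q - 1] of C^perp.
perp_pairs = [(1, 7), (2, 3), (4, 5), (6, 7)]
relative_cohomology = sorted(perp_interval_to_real(p, q, a) for p, q in perp_pairs)
```

Under that assumption, the result is the relative barcode $\{[-\infty,1), [-\infty,6), [2,3), [4,5)\}$ of the running example, as Proposition 2.3 predicts.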

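Suppose now we apply Theorem 2.6 to $\mathbb{C}^\perp$ to obtain a partition $\{1, 2, \ldots, n\} = R \sqcup S \sqcup T$ and
new generators $\hat\tau_i$, with $\delta\hat\tau_r = 0$, $\delta\hat\tau_s = 0$, and $\delta\hat\tau_t = \hat\tau_s$ for pairs $[s, t) \in \mathrm{Pairs}(\mathbb{C}^\perp, \delta)$. Then we
obtain the following table:

$$\begin{array}{l|ccc}
\text{generator} & \hat\tau_r & \hat\tau_s & \hat\tau_t \\ \hline
\text{relative cohomology} & [-\infty, a_{n+1-r}) & [a_{n+1-t}, a_{n+1-s})_+ & \\
\text{absolute cohomology} & [a_{n+1-r}, \infty) & & [a_{n+1-t}, a_{n+1-s})
\end{array}$$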



By considering Proposition 2.3 we deduce that

$$R = n + 1 - F, \qquad S = n + 1 - H, \qquad T = n + 1 - G$$

and moreover $(s, t) \in \mathrm{Pairs}(\mathbb{C}^\perp, \delta)$ if and only if $(n + 1 - t, n + 1 - s) \in \mathrm{Pairs}(\mathbb{C}, \partial)$. Actually, this
can also be inferred from the proof of the following proposition.

Proposition 2.10. Let $\hat\sigma_1^*, \ldots, \hat\sigma_n^*$ denote the dual basis to $\hat\sigma_1, \ldots, \hat\sigma_n$, and write $\hat\tau_i = \hat\sigma^*_{n+1-i}$. Then
the $\hat\tau_i$ are the generators described above (up to nonzero scalar multiples).

Proof. By duality, we have $\delta\hat\sigma_f^* = 0$ for all $f \in F$, and $\delta\hat\sigma_g^* = \hat\sigma_h^*$ for all $[g, h) \in \mathrm{Pairs}(\mathbb{C})$. Moreover,
$\sigma_i^*$ is the trailing term of $\hat\sigma_i^*$. The proposition now follows, by bookkeeping. □

2.8. A remark for the algebraically-minded. According to [3], a persistence module can be
regarded as a graded module over the ring $\mathbf{k}[t]$. In particular, $\mathbb{C}$ can be regarded as a free module
over $\mathbf{k}[t]$ with $n$ generators, $\sigma_1, \ldots, \sigma_n$, where $\sigma_i$ has degree $i$. The boundary map $\partial : \mathbb{C} \to \mathbb{C}$ is
then a homomorphism of graded modules.

The 'dual' of such a module can be taken with respect to the ground field $\mathbf{k}$ or the polynomial
ring $\mathbf{k}[t]$. Thus we can define the global dual

$$\mathbb{C}^\circ = \mathrm{Hom}_{\mathbf{k}[t]}(\mathbb{C}, \mathbf{k}[t]),$$

where $C^\circ_n$ = graded-module homomorphisms $\mathbb{C} \to \mathbf{k}[t]$ of degree $n$;

and the pointwise dual

$$\mathbb{C}^\dagger = \mathrm{Hom}_{\mathbf{k}}(\mathbb{C}, \mathbf{k}),$$

where $C^\dagger_n$ = vector-space homomorphisms $C_{-n} \to \mathbf{k}$.

These can be regarded as graded $\mathbf{k}[t]$-modules in a natural way. Moreover, the operations $-^\circ$ and
$-^\dagger$ are contravariant functors, so in particular the boundary map on $\mathbb{C}$ induces boundary maps on
the new modules.

The interested reader can verify that

$$\begin{array}{ll}
H(\mathbb{C}, \partial) & = \text{persistent absolute homology of } \mathbb{X} \\
H(\mathbb{C}^\dagger, \partial^\dagger) & = \text{persistent absolute cohomology of } \mathbb{X} \\
H(\mathbb{C}^\circ, \partial^\circ) & = \text{persistent relative cohomology of } \mathbb{X} \\
H(\mathbb{C}^{\circ\dagger}, \partial^{\circ\dagger}) & = \text{persistent relative homology of } \mathbb{X}
\end{array}$$

up to calibrating the indices.

3. Matrix Algorithms 

3.1. The boundary matrix. We can represent a filtered cell complex (at least, its homological
information) by a strictly upper-triangular matrix $D$, whose entries $D[i,j]$ are the coefficients $D_{ij}$
of the boundary map $\partial$ defined in Section 2.6. Thus the $j$-th column $D[j] = D[.., j]$ represents $\partial\sigma_j$.

With the cells listed in the filtration order, the matrix $D$ also encodes the filtration of the complex.
Indeed, the top-left square submatrix $D[1..i, 1..i]$ is the boundary matrix for $X_i = \sigma_1 \cup \cdots \cup \sigma_i$, or,
equivalently, the chain complex $C_i$. Thus $D$ is a representation of the chain complex for persistent
absolute homology, $C_*(\mathbb{X}) = (\mathbb{C}, \partial)$.

If we flip the matrix $D$ across its minor diagonal, we get the anti-transpose $D^\perp$, formally
defined by $D^\perp[i,j] = D[n + 1 - j, n + 1 - i]$. Following the discussion in Section 2.7 we see that
$D^\perp$ represents the cochain complex for persistent relative cohomology, $C^*(X_\infty, \mathbb{X}) = ((C_n/\mathbb{C})^*, \delta)$.
The top-left submatrix $D^\perp[1..i, 1..i]$ is the coboundary matrix for $C^*(X_\infty, X_{n-i})$. Indeed, this is
precisely the full coboundary matrix with entries in $X_{n-i}$ removed.

It immediately follows that any procedure applied to matrix $D$ that computes the intervals and
generators of persistent absolute (resp. relative) homology, when applied to matrix $D^\perp$ will give us
the intervals and generators of persistent relative (resp. absolute) cohomology.
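The anti-transpose is a one-line operation. The following Python sketch (the function name `anti_transpose` is an illustrative choice) applies the defining formula $D^\perp[i,j] = D[n+1-j, n+1-i]$ in 0-based indexing to the boundary matrix of the running example, whose entries are taken from the boundary formulas of Section 2.6.

```python
def anti_transpose(D):
    """Flip a square matrix across its minor diagonal.

    0-based form of D_perp[i, j] = D[n + 1 - j, n + 1 - i].
    """
    n = len(D)
    return [[D[n - 1 - j][n - 1 - i] for j in range(n)] for i in range(n)]

# Boundary matrix of the running example (column j holds the boundary of sigma_j).
D = [
    [0, 0,  1,  1,  0,  0],
    [0, 0, -1, -1,  0,  0],
    [0, 0,  0,  0,  1,  1],
    [0, 0,  0,  0, -1, -1],
    [0, 0,  0,  0,  0,  0],
    [0, 0,  0,  0,  0,  0],
]

D_perp = anti_transpose(D)
```

Column 3 of `D_perp` has entries $-1$ in rows 1 and 2, matching $\delta\tau_3 = \delta\sigma_4^* = -(\sigma_5^* + \sigma_6^*) = -\tau_2 - \tau_1$. The result is again strictly upper-triangular, so it can be fed to the same procedure as `D`.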

3.2. Persistence by matrix decomposition. In [8], Cohen-Steiner et al. explain how to view
the computation of persistent homology as a matrix decomposition problem, finding a factorization
$D = RU$, where matrix $R$ is reduced and $U$ is invertible upper-triangular. Here we recap the
relevant definitions.

For any matrix $A$, we define $\mathrm{low}_A(j)$ to be the index of the lowest non-zero entry in the $j$-th
column of $A$ (that is, the largest index $i$ such that $A[i,j] \neq 0$); it is undefined if the column is zero.
We say that matrix $R$ is reduced if $\mathrm{low}_R$ is injective (over its domain of definition).

In what follows it is more convenient to look at the inverse of $U$, matrix $V = U^{-1}$. The
decomposition $D = RU$ becomes $R = DV$. Whereas neither decomposition is unique, Cohen-Steiner
et al. [8] show that the map $\mathrm{low}_R$ is. It is precisely this map that gives the persistence
pairing: a class born at the step $g$ of the filtration dies at the step $h$ iff $\mathrm{low}_R(h) = g$.

Suppose we have a decomposition $R = DV$. If the column $R[i] = 0$ (so $\mathrm{low}_R(i)$ is undefined),
then the column $V[i]$ is a cycle, by definition. Furthermore, since $V$ is invertible upper-triangular,
its diagonal entries are non-zero, and $V[i]$ is a cycle that does not exist in $X_{i-1}$, i.e. it is exactly the
cycle born at $H_*(X_i)$. Similarly, if $R[j] \neq 0$, then it is the cycle that falls in the kernel of the map
$H_*(X_{j-1}) \to H_*(X_j)$, and $V[j]$ is the chain that appears in $X_j$ and has that cycle as its boundary.
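One way to obtain such a decomposition is the familiar left-to-right column reduction. The Python sketch below is a minimal illustration over the rationals, not necessarily the exact procedure the paper presents in Section 3.5; the names `low`, `reduce_columns` and `pivot` are assumptions of this example. It stores $R$ and $V$ as lists of columns and applies every column operation to both, so that $R = DV$ holds throughout.

```python
from fractions import Fraction

def low(col):
    """Largest row index (0-based) with a non-zero entry, or None for a zero column."""
    for i in range(len(col) - 1, -1, -1):
        if col[i] != 0:
            return i
    return None

def reduce_columns(D):
    """Left-to-right column reduction giving R = D V with R reduced.

    D is a square strictly upper-triangular matrix (list of rows).
    R and V are returned as lists of columns; `pivot` maps low_R(h) to h.
    """
    n = len(D)
    R = [[Fraction(D[i][j]) for i in range(n)] for j in range(n)]
    V = [[Fraction(int(i == j)) for i in range(n)] for j in range(n)]
    pivot = {}
    for j in range(n):
        while (l := low(R[j])) is not None and l in pivot:
            k = pivot[l]
            c = R[j][l] / R[k][l]
            # Subtract a multiple of an earlier column; V records the same operation.
            R[j] = [x - c * y for x, y in zip(R[j], R[k])]
            V[j] = [x - c * y for x, y in zip(V[j], V[k])]
        if l is not None:
            pivot[l] = j
    return R, V, pivot

# Boundary matrix of the running example.
D = [
    [0, 0,  1,  1,  0,  0],
    [0, 0, -1, -1,  0,  0],
    [0, 0,  0,  0,  1,  1],
    [0, 0,  0,  0, -1, -1],
    [0, 0,  0,  0,  0,  0],
    [0, 0,  0,  0,  0,  0],
]
R, V, pivot = reduce_columns(D)
pairs = sorted((g + 1, h + 1) for g, h in pivot.items())  # 1-based pairs [g, h)
unpaired = [j + 1 for j in range(len(D)) if low(R[j]) is None and j not in pivot]
```

On the running example this yields the pairs $[2,3)$ and $[4,5)$ and the unpaired indices $1$ and $6$. The rationals are used here only for convenience; any field would do, in line with Section 2.1.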

3.3. Homology generators. To relate the matrix discussion to the algebra of the previous section,
we observe that the new basis of Theorem 2.6 appears in the matrices $R$ and $V$. The generator $\hat\sigma_f$
of the infinite interval $[a_f, \infty)$ is the column $V[f]$; the generator $\hat\sigma_g$ of the interval $[a_g, a_h)$ is the
column $R[h]$; the chain $\hat\sigma_h$ that kills it is the column $V[h]$.

Example. See Figure 1 for the matrices $R$, $D$, $V$ of our running example. The map $\mathrm{low}_R$ gives
the absolute homology persistence intervals $\mathrm{Pers}(H_*(\mathbb{S})) = \{[1,\infty)_0, [2,3)_0, [4,5)_1, [6,\infty)_2\}$ where
the subscript indicates the dimension of the homology class. Columns of the matrices $V$ and $R$
give the cycles generating each of the intervals: they are $\hat\sigma_1 = V[1] = \sigma_1$, $\hat\sigma_2 = R[3] = \sigma_1 - \sigma_2$,
$\hat\sigma_4 = R[5] = \sigma_3 - \sigma_4$, and $\hat\sigma_6 = V[6] = \sigma_6 - \sigma_5$, respectively.



In view of Proposition 2.4, we can also read off the generators for persistent relative homology:
the column $V[f]$ is the generator for the interval $[-\infty, a_f)$, and the column $V[h]$ is the generator for
$[a_g, a_h)$ (the chain that it represents becomes a relative cycle in $H_*(X_\infty, X_g)$, and remains nonzero
until $H_*(X_\infty, X_h)$).

Example. The four intervals of $\mathrm{Pers}(H_*(S_6, S)) = \{[-\infty,1)_0, [2,3)_1, [4,5)_2, [-\infty,6)_2\}$ have respective
generators $\hat\sigma_1 = V[1] = \sigma_1$, $\hat\sigma_3 = V[3] = \sigma_3$, $\hat\sigma_5 = V[5] = \sigma_5$, and $\hat\sigma_6 = V[6] = \sigma_6 - \sigma_5$.

3.4. Cohomology generators. We now take the global dual, and consider persistent relative and
absolute cohomology. This time we compute the decomposition $R^\perp = D^\perp V^\perp$ for the anti-transpose
$D^\perp$ of $D$.

Figure 1. Filtration of a sphere, and the corresponding decompositions $R = DV$
and $R^\perp = D^\perp V^\perp$. Maps $\mathrm{low}_R$ and $\mathrm{low}_{R^\perp}$ are shown with squares around the entries.
We show only the non-zero entries of the matrices.

We must take care to track the indices correctly. As a matrix, the rows and columns of $D^\perp$
are labelled $\{1, \ldots, n\}$ in the usual order. However, row $i$ and column $i$ refer to cell $\sigma_{n+1-i}$ in the
original complex. If we define $i^* = n + 1 - i$, then we can think of the rows and columns as being
labelled $\{n^*, \ldots, 1^*\}$, so that row $i^*$ and column $i^*$ do indeed refer to cell $\sigma_i$. The numerical labels
in Figure 1 for $D^\perp$, $R^\perp$, $V^\perp$ should be thought of as starred labels.
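
The index bookkeeping can be made concrete with a short Python sketch. The helper names `anti_transpose` and `position_of_cell` are hypothetical, indices in the code are 0-based, and the matrix `D` is again an assumed reconstruction of the running example's boundary matrix.

```python
def anti_transpose(D):
    """Reflect a square matrix across its anti-diagonal: D_perp[i][j] = D[n-1-j][n-1-i]."""
    n = len(D)
    return [[D[n - 1 - j][n - 1 - i] for j in range(n)] for i in range(n)]


def position_of_cell(cell, n):
    """0-based row/column position in D_perp that carries the starred label cell^*."""
    return n - cell


# Assumed boundary matrix of the running example (rows and columns 0-based).
D = [
    [0, 0, 1, 1, 0, 0],
    [0, 0, -1, -1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, -1, -1],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
n = len(D)
D_perp = anti_transpose(D)

# Column 4^* of D_perp, read back in terms of cells: the coboundary of sigma_4^*.
col = position_of_cell(4, n)
coboundary_sigma4 = {n - i: D_perp[i][col] for i in range(n) if D_perp[i][col] != 0}
print(coboundary_sigma4)  # {6: -1, 5: -1}
```

Under these assumptions the column labelled $4^*$ records the two 2-cells that have $\sigma_4$ in their boundary, which is the sense in which a column of $D^\perp$ is a coboundary.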

The columns of $R^\perp$ and $V^\perp$ contain the cocycles of $H^*(X_\infty, X)$ and the cochains that kill them.
If $\mathrm{low}_{R^\perp}(g^*) = h^*$, then there is a finite interval $[a_g, a_h)$ generated by the cocycle $\hat\sigma^*_h = R^\perp[g^*]$
and killed by the cochain $\hat\sigma^*_g = V^\perp[g^*]$. If $R^\perp[f^*] = 0$, then there is an infinite interval $[-\infty, a_f)$
generated by $\hat\sigma^*_f = V^\perp[f^*]$.

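Example. The persistent relative cohomology $\mathrm{Pers}(H^*(S_6, S))$ has four intervals $[-\infty,6)_2$, $[4,5)_2$,
$[2,3)_1$, $[-\infty,1)_0$, generated respectively by $\hat\sigma^*_6 = V^\perp[6^*] = \sigma^*_6$, $\hat\sigma^*_5 = R^\perp[4^*] = -\sigma^*_5 - \sigma^*_6$,
$\hat\sigma^*_3 = R^\perp[2^*] = -\sigma^*_3 - \sigma^*_4$, $\hat\sigma^*_1 = V^\perp[1^*] = \sigma^*_1 + \sigma^*_2$.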

Finally, we can read off the generators for persistent absolute cohomology $H^*(X)$: when $R^\perp[f^*] = 0$,
the column $V^\perp[f^*]$ is the cocycle which generates the interval $[a_f, +\infty)$; and when $\mathrm{low}_{R^\perp}(g^*) = h^*$,
the column $V^\perp[g^*]$ is the cocycle which generates $[a_g, a_h)$.

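Example. The persistent absolute cohomology $\mathrm{Pers}(H^*(S))$ has four intervals $[6,+\infty)_2$, $[4,5)_1$,
$[2,3)_0$, $[1,+\infty)_0$, generated respectively by $\hat\sigma^*_6 = V^\perp[6^*] = \sigma^*_6$, $\hat\sigma^*_4 = V^\perp[4^*] = \sigma^*_4$, $\hat\sigma^*_2 = V^\perp[2^*] = \sigma^*_2$,
$\hat\sigma^*_1 = V^\perp[1^*] = \sigma^*_1 + \sigma^*_2$.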

3.5. Column and row algorithms. The original persistence algorithm [HE] finds the pairing by 
processing matrix D column-by-column to obtain the reduced matrix R. In the context of R = DV 
decomposition, one can express it as: 

Algorithm 1 Column algorithm pHcol.

    R = D; V = I
    for i = 1 to n do
        while ∃ j < i with low_R(j) = low_R(i) do
            c = R[i][low_R(i)] / R[j][low_R(j)]
            R[i] = R[i] - c R[j]
            V[i] = V[i] - c V[j]



Here the definition of the constant $c$ ensures that the lowest non-zero element in column $i$ moves
up after each iteration of the while loop. The condition of the while loop immediately implies
that matrix $R$ is reduced when the algorithm terminates. Furthermore, since we perform identical
updates on $R$ and $V$, we get an $R = DV$ decomposition.
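
The column algorithm can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used for the experiments: columns are stored sparsely as dictionaries, coefficients are rationals via `fractions.Fraction`, the function names are hypothetical, and `D` is an assumed reconstruction of the boundary matrix of the running example with 1-based indices.

```python
from fractions import Fraction


def low(col):
    """Largest row index with a non-zero entry in a sparse column, or None if it is zero."""
    return max(col) if col else None


def subtract_multiple(target, source, c):
    """In place: target -= c * source, for sparse columns {row index: coefficient}."""
    for i, x in source.items():
        v = target.get(i, 0) - c * x
        if v:
            target[i] = v
        else:
            target.pop(i, None)


def ph_col(D, n):
    """Column algorithm on a sparse matrix D = {column: {row: coefficient}}; returns (R, V)."""
    R = {j: {i: Fraction(x) for i, x in D.get(j, {}).items()} for j in range(1, n + 1)}
    V = {j: {j: Fraction(1)} for j in range(1, n + 1)}
    column_with_low = {}  # row index -> the earlier column whose lowest entry sits there
    for i in range(1, n + 1):
        while R[i] and low(R[i]) in column_with_low:
            j = column_with_low[low(R[i])]
            c = R[i][low(R[i])] / R[j][low(R[j])]
            subtract_multiple(R[i], R[j], c)  # identical updates on R and V
            subtract_multiple(V[i], V[j], c)
        if R[i]:
            column_with_low[low(R[i])] = i
    return R, V


# Assumed boundary matrix of the sphere filtration (two vertices, two edges, two 2-cells).
D = {3: {1: 1, 2: -1}, 4: {1: 1, 2: -1}, 5: {3: 1, 4: -1}, 6: {3: 1, 4: -1}}
R, V = ph_col(D, 6)
pairs = sorted((low(R[h]), h) for h in R if R[h])
print(pairs)  # [(2, 3), (4, 5)]
print(V[6])   # sigma_6 - sigma_5, the generator of the interval born at step 6
```

The dictionary `column_with_low` plays the role of the existential test in the while loop: because the earlier columns are already reduced, at most one of them can share a lowest entry with column $i$.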

The algorithm pHcol is essentially Gaussian elimination performed using column operations.
More commonly one would apply the same column operations while processing the matrix row-by-row
from the bottom up:

Algorithm 2 Row algorithm pHrow.

    R = D; V = I
    for i = n down to 1 do
        indices = [j | low_R(j) = i]    # lows in the row i of R
        p = indices[0]                  # pivot
        for j ∈ indices[1..] do
            c = R[j][low_R(j)] / R[p][low_R(p)]
            R[j] = R[j] - c R[p]
            V[j] = V[j] - c V[p]

It is not difficult to see that this algorithm also produces an R = DV decomposition where matrix 
R is reduced, and matrix V is invertible upper-triangular. What is less immediate is that the 
two algorithms produce identical decompositions, so we prove this fact formally. (Notice that the 
statement would not be true if, during pHrow, we tried to cancel all non-zero elements in row i 
of R, rather than restricting attention to the columns picked out by indices.) 

Theorem 3.1 (Identical Output Theorem). The decompositions $R_c = DV_c$ and $R_r = DV_r$ produced
by column and row algorithms respectively are identical, i.e. $R_c = R_r$ and $V_c = V_r$.

Proof. We observe that once it determines the lowest non-zero element in a given column of matrix
$R$, neither algorithm changes that column in any subsequent operations. Given a matrix $R = D$
we prove the claim by induction. The first column with the lowest non-zero entry in $R$ is not
modified by either algorithm. Suppose that the columns with the lowest non-zero entries below $i$
are identical in both $R_c$ and $R_r$, and $V_c$ and $V_r$. During the computation of the column with the
lowest non-zero entry in row $i$ we add columns with $\mathrm{low}_R > i$ in a decreasing order dictated by
the lowest non-zero entry of the column. Since the order and the columns are identical, so is the
result. □
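
The statement of the theorem can be checked on small inputs with the following Python sketch, which implements both algorithms on sparse rational columns and compares their outputs. The function names, the guard for rows that contain no lowest entry, and the filled-triangle test filtration are all assumptions introduced for the illustration; a passing check on one example is of course no substitute for the proof.

```python
from fractions import Fraction


def low(col):
    return max(col) if col else None


def subtract_multiple(target, source, c):
    """In place: target -= c * source, for sparse columns {row index: coefficient}."""
    for i, x in source.items():
        v = target.get(i, 0) - c * x
        if v:
            target[i] = v
        else:
            target.pop(i, None)


def initial(D, n):
    R = {j: {i: Fraction(x) for i, x in D.get(j, {}).items()} for j in range(1, n + 1)}
    V = {j: {j: Fraction(1)} for j in range(1, n + 1)}
    return R, V


def ph_col(D, n):
    R, V = initial(D, n)
    for i in range(1, n + 1):
        while True:
            js = [j for j in range(1, i) if R[i] and low(R[j]) == low(R[i])]
            if not js:
                break
            j = js[0]
            c = R[i][low(R[i])] / R[j][low(R[j])]
            subtract_multiple(R[i], R[j], c)
            subtract_multiple(V[i], V[j], c)
    return R, V


def ph_row(D, n):
    R, V = initial(D, n)
    for i in range(n, 0, -1):
        indices = [j for j in range(1, n + 1) if low(R[j]) == i]
        if not indices:
            continue  # assumed guard: no column has its lowest entry in row i
        p = indices[0]  # pivot: the leftmost such column
        for j in indices[1:]:
            c = R[j][i] / R[p][i]
            subtract_multiple(R[j], R[p], c)
            subtract_multiple(V[j], V[p], c)
    return R, V


# Hypothetical test filtration: a filled triangle (vertices 1-3, edges 4-6, 2-cell 7).
triangle = {4: {1: -1, 2: 1}, 5: {2: -1, 3: 1}, 6: {1: -1, 3: 1}, 7: {4: 1, 5: 1, 6: -1}}
print(ph_col(triangle, 7) == ph_row(triangle, 7))  # True
```

In this example column 6 is cleared in two steps by both algorithms, first against column 5 and then against column 4, which is the decreasing order of lowest entries that the proof relies on.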

Remark. Recently Milosavljević, Morozov, and Skraba [11] showed that one can compute persistence
in matrix multiplication time.

Remark. One can apply the two algorithms of this section to the restricted matrix $D_p$ that gives
only the boundaries of the $p$-dimensional cells. We can still extract some information from the
$R_p = D_p V_p$ decomposition of this matrix: the finite intervals $[g, h)$ in dimension $p - 1$ and the
births in dimension $p$, i.e. the endpoints $g$ or $h$ of any $p$-dimensional interval.

4. Optimizations 

4.1. Cohomology algorithm. One of our goals has been to relate our present work to an algorithm
pCoh for persistent absolute cohomology that we described in [6]. We based that algorithm
on the idea of maintaining a right filtration (defined in [12]); as a result it looks different from
pHcol and pHrow above. In fact, we now show that one can view pCoh as an optimization of pHrow
applied to the matrix $D^\perp$. We begin by reviewing the algorithm:

Algorithm 3 Cohomology algorithm pCoh.

    Z^⊥ = [], birth = []
    for i = 1 to n do
        indices = [j | σ_i^* ∈ δz_j^*, z_j^* unmarked in Z^⊥]
        if indices are empty then
            prepend σ_i^* to Z^⊥ and i to birth
        else
            prepend a marked σ_i^* to Z^⊥ and i to birth
            p = indices[0]
            for j = 1 to size(indices) do
                c = (δZ^⊥[indices[j]])[i] / (δZ^⊥[p])[i]
                Z^⊥[indices[j]] = Z^⊥[indices[j]] - c Z^⊥[p]
            mark Z^⊥[p] and output the pair [birth[p], i)

Figure 2. The structure of matrices $R^\perp = D^\perp V^\perp$ during the execution of the row algorithm.



List $Z^\perp$ maintains the cocycle basis for $H^*(X_i)$ in the right filtration order dictated by the filtration
of the space. The marking above is for exposition only; in practice we drop a cocycle from the list
$Z^\perp$ as soon as it dies. When a new cell $\sigma_i$ enters, it is necessarily a cocycle (since it has no cofaces),
but it may fall into a coboundary of a former cocycle, in which case (the else clause) we update the
right filtration and drop the cocycle that $\sigma_i$ kills.
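
A minimal Python sketch of this procedure follows, in the practical variant that drops a cocycle as soon as it dies instead of marking it. It is an illustration and not the Dionysus code: the name `p_coh`, the sparse dictionary representation, the rational coefficients and the reconstructed boundary matrix `D` of the running example are assumptions. The coefficient of $\sigma_i^*$ in $\delta z$ is computed as the value of $z$ on the boundary of $\sigma_i$, so only the boundary of the entering cell is ever needed.

```python
from fractions import Fraction


def p_coh(D, n):
    """Returns the finite pairs and the surviving (birth, cocycle) entries, newest first."""
    Z = []      # live cocycles, newest first, each stored as [birth, {cell: coefficient}]
    pairs = []
    for i in range(1, n + 1):
        boundary = D.get(i, {})
        # Coefficient of sigma_i^* in the coboundary of each live cocycle.
        coeff = [sum(z.get(k, 0) * x for k, x in boundary.items()) for _, z in Z]
        indices = [idx for idx, c in enumerate(coeff) if c != 0]
        if not indices:
            Z.insert(0, [i, {i: Fraction(1)}])  # sigma_i^* starts a new class
            continue
        p = indices[0]  # the youngest cocycle whose coboundary contains sigma_i^*
        for j in indices[1:]:
            c = Fraction(coeff[j]) / Fraction(coeff[p])
            for k, x in Z[p][1].items():
                v = Z[j][1].get(k, 0) - c * x
                if v:
                    Z[j][1][k] = v
                else:
                    Z[j][1].pop(k, None)
        pairs.append((Z[p][0], i))
        del Z[p]        # drop the dead cocycle instead of marking it
    return pairs, [(b, z) for b, z in Z]


# Assumed boundary matrix of the sphere filtration (1-based cell indices).
D = {3: {1: 1, 2: -1}, 4: {1: 1, 2: -1}, 5: {3: 1, 4: -1}, 6: {3: 1, 4: -1}}
pairs, alive = p_coh(D, 6)
print(pairs)  # [(2, 3), (4, 5)]
print(alive)  # births 6 and 1, with cocycles sigma_6^* and sigma_1^* + sigma_2^*
```

On this input the sketch reproduces the absolute cohomology example of Section 3.4: the finite intervals $[2,3)$ and $[4,5)$, and the two surviving cocycles for the infinite intervals.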

To see that this algorithm is a variation of the row algorithm from the previous section, observe
that the cocycles that it maintains are stored in the bottom-right corner of matrix $V^\perp$ during the
execution of the row algorithm.

Claim 4.1. The matrix $Z^\perp$ in the cohomology algorithm after iteration $i$ is equal to the bottom-right
corner of the matrix $V^\perp[(n-i)..n, (n-i)..n]$ after the $i$-th iteration of the row algorithm.

Proof. We prove the claim inductively. Denoting with $R_i^\perp$, $V_i^\perp$, $Z_i^\perp$ the various matrices after $i$
iterations of both algorithms, assume the unmarked cocycles $z_j^*$ in $Z_i^\perp$ are exactly the cocycles with
$\mathrm{low}_{R_i^\perp}(j) > i$. In other words, the corresponding columns $R_i^\perp[j] = 0$. Furthermore assume that the
two matrices are identical, i.e. $V_i^\perp = Z_i^\perp$. The claim is true when $i = 0$. Our goal is to show it is
true for $i = k$ assuming it is true for $i = k - 1$.

At the $k$-th iteration, if cell $\sigma_k^*$ does not appear in the coboundary of any cocycle, then its row
in $R_{k-1}^\perp = D^\perp V_{k-1}^\perp = D^\perp Z_{k-1}^\perp$ is zero. It follows that it is not in the image of the map $\mathrm{low}_{R^\perp}$ and
therefore neither algorithm performs any changes, so $V_k^\perp = Z_k^\perp$, and unmarked cocycles remain as
claimed.

If cell $\sigma_k^*$ is in the coboundary of a cocycle $z_j^*$, then $k$ is in the image $\operatorname{im} \mathrm{low}_{R^\perp}$. Moreover, from
the inductive hypothesis the indices $j$ of the columns of $R_{k-1}^\perp$ that have $\mathrm{low}_{R^\perp}(j) = k$ are exactly
the unmarked cocycles in $Z_{k-1}^\perp$ that have $\sigma_k^*$ in their coboundary. Therefore, the update performed
by both algorithms is identical. □

Remark. Since the matrix $R$ contains the final persistence pairing, expressed as the map $\mathrm{low}_R$, the
algorithm pHcol is commonly optimized to keep track only of this matrix (and ignore matrix $V$).
In contrast, pCoh maintains only matrix $Z^\perp = V^\perp$.

4.2. Practice. The algorithm pCoh above highlights the difference between the column and the 
row versions of the persistence algorithm. pHcol stores all the dead cycles since it has no choice: 
any of them might be required at some future point in the reduction. pHrow, on the other hand, 
is able to 'examine the future' by inspecting any chosen row. It is therefore free to drop a column 
once it has determined its pairing and used it in the update. pCoh does so explicitly. 

In practice, such row access may be difficult when computing homology: it requires quick access
to the coboundary of a given cell (since that is what a row of $D$ is). In simplicial complex
implementations it is common to represent simplices as lists of vertices; then their boundary maps
are easy to compute on the fly, while their coboundaries require a full preprocessing of the entire
boundary matrix. By switching to cohomology we turn the tables: all the primitives necessary
for the row algorithm (and in particular the optimized version given in this section) are readily
available.
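
The asymmetry between the two primitives can be illustrated with a small Python sketch. Simplices are represented as sorted tuples of vertices; the function names and the example complex (a filled triangle on vertices 0, 1, 2) are hypothetical. The boundary of a simplex is computed from the simplex alone, whereas the coboundary lookup requires a pass over the whole complex.

```python
from collections import defaultdict
from itertools import combinations


def boundary(simplex):
    """Signed codimension-one faces of a simplex given as a sorted tuple of vertices."""
    if len(simplex) == 1:
        return []
    return [((-1) ** k, simplex[:k] + simplex[k + 1:]) for k in range(len(simplex))]


def coboundary_index(simplices):
    """Invert the boundary relation; this needs one pass over the entire complex."""
    cofaces = defaultdict(list)
    for s in simplices:
        for sign, face in boundary(s):
            cofaces[face].append((sign, s))
    return cofaces


# Hypothetical complex: all non-empty subsets of {0, 1, 2}.
simplices = [c for d in (1, 2, 3) for c in combinations(range(3), d)]

print(boundary((0, 1, 2)))    # computed on the fly from the vertex list
cofaces = coboundary_index(simplices)
print(cofaces[(0,)])          # available only after the preprocessing pass
```

In the cohomology algorithm only `boundary` of the entering cell is needed at each step, so the preprocessing pass can be avoided altogether.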

4.3. Experiments. The practical improvement resulting from these observations is startling. In
the following table we compare the traditional persistent homology algorithm pHcol with the cohomology
algorithm pCoh. We list the total number of operations performed (in terms of primitive
operations during chain arithmetic), total running time, and peak space usage in terms of the
number of elements stored.



Dataset     Algorithm        Operations     Time    Peak elements
M-50        pCoh          2,171,909,275    106 s          575,758
M-50        pHcol       609,477,028,616   4160 s        6,461,866
T-10,000    pCoh             55,930,317      6 s           22,629
T-10,000    pHcol        29,760,159,689    207 s          693,031



We used the C++ library Dionysus [13] to perform the above experiments. The homology algorithm
pHcol in the above table computes only the matrix $R$ since it suffices to extract the barcode.
It also uses the original optimization of [1] and stores the non-zero coefficients only in the rows that
correspond to the positive cells. M-50 is a filtration of an 8-skeleton of a Rips complex built on
50 random points of a Mumford dataset [14, 15] up to the maximum pairwise distance of 1.5; the
largest complex consists of 663,901 simplices. T-10,000 is an alpha shape filtration of 10,000 points
sampled on a torus embedded in $\mathbb{R}^3$; the size of the Delaunay triangulation is 557,727 simplices.
The speed-up is encouraging. We would like to point out that these examples are not cherry-picked:
we have yet to find a filtration on which pHcol is the faster of the two.
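
The improvement factors implied by the table can be recomputed with a few lines of Python. The figures are copied from the table above; the dictionary layout and the helper `ratios` are hypothetical conveniences for the sketch.

```python
# (operations, seconds, peak elements) for each dataset and algorithm, from the table.
results = {
    "M-50": {"pCoh": (2_171_909_275, 106, 575_758),
             "pHcol": (609_477_028_616, 4160, 6_461_866)},
    "T-10,000": {"pCoh": (55_930_317, 6, 22_629),
                 "pHcol": (29_760_159_689, 207, 693_031)},
}


def ratios(dataset):
    """pHcol cost divided by pCoh cost, for operations, time and peak elements."""
    coh, col = results[dataset]["pCoh"], results[dataset]["pHcol"]
    return tuple(b / a for a, b in zip(coh, col))


for name in results:
    ops, seconds, space = ratios(name)
    print(f"{name}: {ops:.1f}x operations, {seconds:.1f}x time, {space:.1f}x peak elements")
```

On both datasets the operation count shrinks by a much larger factor than the running time or the peak storage.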

Conclusion. When combined, the algebraic and experimental observations suggest that if given
a choice, one is better off using the cohomology algorithm. Most of the time one has such a choice:
for example, when computing only the persistence diagram.